Efficient reinforcement learning: model-based Acrobot control
نویسنده
چکیده
|Several methods have been proposed in the reinforcement learning literature for learning optimal policies for sequential decision tasks. Q-learning is a model-free algorithm that has recently been applied to the Acrobot, a two-link arm with a single actuator at the elbow that learns to swing its free endpoint above a target height. However, applying Q-learning to a real Acrobot may be impractical due to the large number of required movements of the real robot as the controller learns. This paper explores the planning speed and data eeciency of explicitly learning models, as well as using heuristic knowledge to aid the search for solutions and reduce the amount of data required from the real robot.
منابع مشابه
High-accuracy value-function approximation with neural networks applied to the acrobot
Several reinforcement-learning techniques have already been applied to the Acrobot control problem, using linear function approximators to estimate the value function. In this paper, we present experimental results obtained by using a feedforward neural network instead. The learning algorithm used was model-based continuous TD(λ). It generated an efficient controller, producing a high-accuracy ...
متن کاملApplication of reinforcement learning to balancing of acrobot
The acrobot is a two-link robot, actuated only at the joint between the two links. It is one of dicult tasks in reinforcement learning (RL) to control the acrobot because it has nonlinear dynamics and continuous state and action spaces. In this article, we discuss applying the RL to the task of balancing control of the acrobot. Our RL method has an architecture similar to the actor-critic. The ...
متن کاملUsing BELBIC based optimal controller for omni-directional threewheel robots model identified by LOLIMOT
In this paper, an intelligent controller is applied to control omni-directional robots motion. First, the dynamics of the three wheel robots, as a nonlinear plant with considerable uncertainties, is identified using an efficient algorithm of training, named LoLiMoT. Then, an intelligent controller based on brain emotional learning algorithm is applied to the identified model. This emotional l...
متن کاملApplication of reinforcement learning based on on-line EM algorithm to balancing of acrobot
متن کامل
Representation Discovery for Kernel-Based Reinforcement Learning
Recent years have seen increased interest in non-parametric reinforcement learning. There are now practical kernel-based algorithms for approximating value functions; however, kernel regression requires that the underlying function being approximated be smooth on its domain. Few problems of interest satisfy this requirement in their natural representation. In this paper we define value-consiste...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 1997